Little prince example

This is a nice book for both young and old. It gives beautiful life lessons in a fun way. Definitely worth the money!

+ Educational

+ Funny

+ Price


Nice story for older children.

+ Funny

- Readability

Sentiment

  • Sentiment =

    • Feelings, Attitudes, Emotions, Opinions

    • A thought, view, or attitude, especially one based mainly on emotion instead of reason

  • Subjective impressions, not facts

Webster’s Dictionary


Scherer Typology of Affective States

  • Emotion: brief organically synchronized … evaluation of a major event
    • angry, sad, joyful, fearful, ashamed, proud, elated
  • Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
    • cheerful, gloomy, irritable, listless, depressed, buoyant
  • Interpersonal stances: affective stance toward another person in a specific interaction
    • friendly, flirtatious, distant, cold, warm, supportive, contemptuous
  • Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
    • liking, loving, hating, valuing, desiring
  • Personality traits: stable personality dispositions and typical behavior tendencies
    • nervous, anxious, reckless, morose, hostile, jealous

Sentiment Analysis

  • Sentiment Analysis

    • Use of natural language processing (NLP) and computational techniques to automate the extraction or classification of sentiment from typically unstructured text
  • Opinion mining

  • Sentiment mining

  • Subjectivity analysis

Sentiment analysis

can be applied in every topic & domain!

  • Book: is this review positive or negative?

  • Humanities: sentiment analysis for German historical plays.

  • Products: what do people think about the new iPhone?

  • Blogs: what do people think about immigrants?

  • Politics: who is going to win the election?

  • Twitter: what is trending?

  • Movie: is this review positive or negative (IMDB, Netflix)?

  • Marketing: how is consumer confidence? Consumer attitudes?

  • Healthcare: are patients happy with the hospital environment?

Two main types of opinions

(Jindal and Liu 2006; Liu, 2010)

  • Regular opinions: Sentiment/opinion expressions on some target entities

    • Direct opinions:

      • “The touch screen is really cool.”
    • Indirect opinions:

      • “After taking the drug, my pain has gone.”
  • Comparative opinions: Comparison of more than one entity.

    • E.g., “iPhone is better than Blackberry.”

Practical definition

(Hu and Liu 2004; Liu, 2010, 2012)

  • An opinion is a quintuple

    (entity, aspect, sentiment, holder, time)

    where

    • entity: the target entity (or object).

    • aspect: an aspect (or feature) of the entity.

    • sentiment: +, -, or neu; a rating; or an emotion.

    • holder: the opinion holder.

    • time: the time when the opinion was expressed.
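The quintuple can be sketched as a small data structure; the field values below are illustrative, not drawn from a real review.

```python
# A minimal sketch of the (entity, aspect, sentiment, holder, time) quintuple
# from Hu and Liu (2004) / Liu (2010, 2012); the example values are made up.
from dataclasses import dataclass

@dataclass
class Opinion:
    entity: str      # target entity (or object), e.g. a product
    aspect: str      # aspect (or feature) of the entity
    sentiment: str   # "+", "-", or "neu" (could also be a rating or emotion)
    holder: str      # opinion holder
    time: str        # time when the opinion was expressed

op = Opinion(entity="iPhone", aspect="touch screen",
             sentiment="+", holder="user123", time="2015-08-16")
print(op)
```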

Example

Kindle Customer Reviewed in the United States on August 16, 2015:

This has been my favorite book since I was 14 and had to read it in French as an assignment in school. I fell in love with it and immediately bought the English translation by Katherine Woods, as I knew I would read it many times over the years and I knew my French was not likely to improve. Today I bought this version to have on my Kindle as I was thinking of giving my 40 year old paperback to my best friend. I could not be more disappointed. The changes in this translation take so much away from the book that it almost changes who the Little Prince really is. The charm of the book is completely missing. In one of my favorite parts of the book the fox talks to the Little Prince, sharing his invaluable truth: “what is essential is invisible to the eye.” Howard changes it to “Anything essential is invisible to the eyes”, which changes the entire concept of what is said. “The eye” is every eye, everywhere. Making it plural takes away the meaning of what the fox is really saying. If you want to read this book, if you want to read it to your children, please take my advice and find the Katherine Woods translation, even if it means going to a used book store. I simply cannot understand what Howard was thinking in all of the changes he made to this wonderful story that will stay with you for a lifetime, but only if you read the Woods translation which will open your eyes to the true meaning of the Little Prince. As the fox says: “Words are the source of misunderstandings” and Howard has changed the words so much that indeed, in this version, words are very much the source of misunderstandings.

Sentiment Analysis

  • Simplest task:

    • Is the attitude of this text positive or negative?
  • More complex:

    • Rank the attitude of this text from 1 to 5
  • Advanced:

    • Detect the target, source, or complex opinion types
    • Implicit opinions or aspects

Simple task: Opinion summary

Aspect/feature Based Summary of opinions about iPhone:

Aspect: Touch screen
Positive: 212

The touch screen was really cool.
The touch screen was so easy to use and can do amazing things.


Negative: 6

The screen is easily scratched.
I have a lot of difficulty in removing finger marks from the touch screen.


Aspect: Size
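An aspect-based summary like the one above is just an aggregation over pre-labeled (aspect, sentiment, sentence) triples. A toy sketch (the triples are hand-made stand-ins for the output of an aspect/sentiment extractor):

```python
# Toy sketch of aspect-based opinion summarization: group sentences by aspect
# and polarity, then report counts per aspect. Input triples are hypothetical.
from collections import defaultdict

labeled = [
    ("touch screen", "+", "The touch screen was really cool."),
    ("touch screen", "+", "The touch screen was so easy to use."),
    ("touch screen", "-", "The screen is easily scratched."),
    ("size", "-", "The phone is a little large."),
]

summary = defaultdict(lambda: {"+": [], "-": []})
for aspect, sent, sentence in labeled:
    summary[aspect][sent].append(sentence)

for aspect, groups in summary.items():
    print(f"Aspect: {aspect}  Positive: {len(groups['+'])}  "
          f"Negative: {len(groups['-'])}")
```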

Problem

  • Which features to use?

    • Words (unigrams)
    • Phrases/n-grams
    • Sentences
  • How to interpret features for sentiment detection?

    • Bag of words (IR)
    • Annotated lexicons (WordNet, SentiWordNet)
    • Syntactic patterns
    • Paragraph structure

Challenges

  • Harder than topical classification, with which bag of words features perform well

  • Must consider other features due to…
    • Subtlety of sentiment expression
      • irony
      • expression of sentiment using neutral words
    • Domain/context dependence
      • words/phrases can mean different things in different contexts and domains
    • Effect of syntax on semantics

Approaches for Sentiment Analysis

  • Lexicon-based methods (dictionary-based)
    • Using sentiment words and phrases: good, wonderful, awesome, troublesome, cost an arm and leg

    • Not completely unsupervised!

  • Supervised learning methods: to classify reviews into positive and negative.
    • Machine learning
      • Naïve Bayes, Maximum Entropy, Support Vector Machine
    • Recent research
      • Deep learning

Lexicon-based methods

The General Inquirer

LIWC (Linguistic Inquiry and Word Count)

  • Home page: http://www.liwc.net/
  • 2300 words, >70 classes

  • Affective Processes
    • negative emotion (bad, weird, hate, problem, tough)
    • positive emotion (love, nice, sweet)
  • Cognitive Processes
    • Tentative (maybe, perhaps, guess), Inhibition (block, constraint)
  • Pronouns, Negation (no, never), Quantifiers (few, many)



Pennebaker, J.W., Booth, R.J., & Francis, M.E. (2007). Linguistic Inquiry and Word Count: LIWC 2007. Austin, TX

MPQA Subjectivity Cues Lexicon




Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.
Riloff and Wiebe (2003). Learning extraction patterns for subjective expressions. EMNLP-2003.

Bing Liu Opinion Lexicon

SentiWordNet

  • Home page: http://sentiwordnet.isti.cnr.it/

  • All WordNet synsets automatically annotated for degrees of positivity, negativity, and neutrality/objectiveness

  • [estimable(J,3)] “may be computed or estimated”
    \[\mathrm{Pos}\ 0 \quad \mathrm{Neg}\ 0 \quad \mathrm{Obj}\ 1\]
  • [estimable(J,1)] “deserving of respect or high regard”
    \[\mathrm{Pos}\ 0.75 \quad \mathrm{Neg}\ 0 \quad \mathrm{Obj}\ 0.25\]


    Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010. SENTIWORDNET 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. LREC-2010.

Disagreements between polarity lexicons

Analyzing the polarity of each word in IMDB

Potts, Christopher. 2011. On the negativity of negation. SALT 20, 636-659.

  • How likely is each word to appear in each sentiment class?
  • Count(“bad”) in 1-star, 2-star, 3-star, etc.
  • But can’t use raw counts:
  • Instead, likelihood: \(P(w|c) = \frac{f(w,c)}{\sum_{w \in c}{f(w,c)}}\)
  • Make them comparable between words
    • Scaled likelihood: \(\frac{P(w|c)}{P(w)}\)
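A minimal sketch of these two quantities, using hypothetical word counts per star-rating class:

```python
# Toy sketch of Potts-style likelihood P(w|c) and scaled likelihood P(w|c)/P(w).
# `f` holds hypothetical counts of each word in 1-5 star review classes.
f = {
    "bad":  {1: 90, 2: 60, 3: 30, 4: 15, 5: 5},
    "good": {1: 20, 2: 30, 3: 60, 4: 90, 5: 100},
}

def likelihood(w, c):
    # P(w|c): count of w in class c, normalized over these words in class c
    return f[w][c] / sum(f[v][c] for v in f)

def scaled(w, c):
    # P(w|c) / P(w): makes the curves comparable across words
    grand_total = sum(sum(counts.values()) for counts in f.values())
    p_w = sum(f[w].values()) / grand_total
    return likelihood(w, c) / p_w

print(scaled("bad", 1), scaled("bad", 5))  # "bad" skews toward 1-star reviews
```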


Other sentiment feature: Logical negation

Potts, Christopher. 2011. On the negativity of negation. SALT 20, 636-659.

  • Is logical negation (no, not) associated with negative sentiment?

  • Potts experiment:
    • Count negation (not, n’t, no, never) in online reviews
    • Regress against the review rating
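The counting step of this experiment can be sketched with a simple token pattern (the exact tokenization in Potts' study may differ):

```python
# Sketch of the counting step in Potts' negation experiment: count logical
# negation tokens in a review text, to be regressed against the star rating.
import re

NEGATIONS = re.compile(r"\b(?:not|no|never)\b|n't", re.IGNORECASE)

def count_negations(text):
    return len(NEGATIONS.findall(text))

print(count_negations("I didn't like it, it's not good and never will be."))
```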

Potts 2011 Results:
More negation in negative sentiment

Semi-supervised learning of lexicons

  • Use a small amount of information
    • A few labeled examples
    • A few hand-built patterns
  • To bootstrap a lexicon

Hatzivassiloglou and McKeown intuition for identifying word polarity

Vasileios Hatzivassiloglou and Kathleen R. McKeown. 1997. Predicting the Semantic Orientation of Adjectives. ACL, 174–181

  • Adjectives conjoined by “and” have same polarity
    • Fair and legitimate, corrupt and brutal
    • *fair and brutal, *corrupt and legitimate
  • Adjectives conjoined by “but” do not
    • fair but brutal
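The harvesting step behind this intuition can be sketched over POS-tagged text; the tags below are hand-assigned rather than produced by a tagger.

```python
# Toy sketch of the Hatzivassiloglou & McKeown intuition: collect adjective
# pairs conjoined by "and" (same polarity) vs. "but" (opposite polarity).
tagged = [("fair", "JJ"), ("and", "CC"), ("legitimate", "JJ"),
          ("fair", "JJ"), ("but", "CC"), ("brutal", "JJ")]

same_polarity, diff_polarity = [], []
for (w1, t1), (cc, _), (w2, t2) in zip(tagged, tagged[1:], tagged[2:]):
    if t1 == "JJ" and t2 == "JJ":       # adjective ... adjective
        if cc == "and":
            same_polarity.append((w1, w2))
        elif cc == "but":
            diff_polarity.append((w1, w2))

print(same_polarity, diff_polarity)
```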

Hatzivassiloglou & McKeown 1997
Step 1

  • Label a seed set of 1336 adjectives (all adjectives occurring >20 times in a 21-million-word WSJ corpus)
    • 657 positive
      • adequate central clever famous intelligent remarkable reputed sensitive slender thriving…
    • 679 negative
      • contagious drunken ignorant lanky listless primitive strident troublesome unresolved unsuspecting…

Hatzivassiloglou & McKeown 1997
Step 2

  • Expand seed set to conjoined adjectives

Hatzivassiloglou & McKeown 1997
Step 3

  • Supervised classifier assigns “polarity similarity” to each word pair, resulting in graph:

Hatzivassiloglou & McKeown 1997
Step 4

  • Clustering for partitioning the graph into two

Output polarity lexicon

  • Positive
    • bold decisive disturbing generous good honest important large mature patient peaceful positive proud sound stimulating straightforward strange talented vigorous witty…
  • Negative
    • ambiguous cautious cynical evasive harmful hypocritical inefficient insecure irrational irresponsible minor outspoken pleasant reckless risky selfish tedious unsupported vulnerable wasteful…

Turney Algorithm

Turney, Peter D. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL, 417-424.

  1. Extract a phrasal lexicon from reviews
  2. Learn polarity of each phrase
  3. Rate a review by the average polarity of its phrases

Extract two-word phrases with adjectives
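Step 1 matches adjacent POS-tag pairs against adjective patterns. A sketch using a subset of the patterns (JJ NN, JJ NNS, RB JJ); the tagged input is hand-made:

```python
# Sketch of Turney's phrase extraction: keep two-word phrases whose POS tags
# match adjective patterns such as JJ NN or RB JJ (a subset of the full table).
PATTERNS = {("JJ", "NN"), ("JJ", "NNS"), ("RB", "JJ")}

def extract_phrases(tagged):
    return [w1 + " " + w2
            for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
            if (t1, t2) in PATTERNS]

print(extract_phrases([("online", "JJ"), ("service", "NN"), ("was", "VBD"),
                       ("very", "RB"), ("handy", "JJ")]))
# ['online service', 'very handy']
```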

How to measure polarity of a phrase?

  • Positive phrases co-occur more with “excellent”

  • Negative phrases co-occur more with “poor”

  • But how to measure co-occurrence?

Pointwise Mutual Information

  • Mutual information between 2 random variables X and Y

\[I(X,Y) = \sum_X \sum_Y{P(x,y)log_2{\frac{P(x,y)}{P(x)P(y)}}}\]

  • Pointwise mutual information:
    • How much more do events x and y co-occur than if they were independent?

\[PMI(X,Y)=log_2{\frac{P(x,y)}{P(x)P(y)}}\]

Pointwise Mutual Information

  • Pointwise mutual information:
    • How much more do events x and y co-occur than if they were independent?

\[PMI(X,Y)=log_2{\frac{P(x,y)}{P(x)P(y)}}\]

  • PMI between two words:
    • How much more do two words co-occur than if they were independent?

\[PMI(word_1,word_2)=log_2{\frac{P(word_1,word_2)}{P(word_1)P(word_2)}}\]

How to Estimate Pointwise Mutual Information

  • Query search engine (Altavista)
    • P(word) estimated by hits(word)/N
    • P(word1, word2) estimated by hits(word1 NEAR word2)/N^2

\[PMI(word_1,word_2)=log_2{\frac{hits(word_1 \: \mathrm{NEAR} \: word_2)}{hits(word_1)hits(word_2)}}\]

Does phrase appear more with “poor” or “excellent”?

\[ \begin{align} \mathrm{Polarity}(phrase) &= \mathrm{PMI}(phrase, \mathrm{"excellent"}) - \mathrm{PMI}(phrase, \mathrm{"poor"}) \\ \\ &= log_2{\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"excellent"})}{hits(phrase)hits(\mathrm{"excellent"})}} - log_2{\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"poor"})}{hits(phrase)hits(\mathrm{"poor"})}} \\ \\ &= log_2{\left(\frac{hits(phrase \: \mathrm{NEAR} \: \mathrm{"excellent"})\, hits(\mathrm{"poor"})}{hits(phrase \: \mathrm{NEAR} \: \mathrm{"poor"})\, hits(\mathrm{"excellent"})}\right)} \end{align} \]
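After the simplification above, polarity depends on only four hit counts. A minimal sketch (the counts below are made up, not real search-engine hits):

```python
import math

# Sketch of Turney's polarity score from four (hypothetical) hit counts,
# using the simplified log-ratio form derived above.
def polarity(near_excellent, near_poor, hits_excellent, hits_poor):
    return math.log2((near_excellent * hits_poor) /
                     (near_poor * hits_excellent))

# A phrase seen 80 times near "excellent" and 10 times near "poor"
print(polarity(80, 10, 2000, 1500))  # positive => phrase leans positive
```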

Phrases from a thumbs-up review

Phrase                   POS tags   Polarity
online service           JJ NN      2.8
online experience        JJ NN      2.3
direct deposit           JJ NN      1.3
local branch             JJ NN      0.42
low fees                 JJ NNS     0.33
true service             JJ NN      -0.73
other bank               JJ NN      -0.85
inconveniently located   JJ NN      -1.5
Average                             0.32

Phrases from a thumbs-down review

Phrase                POS tags   Polarity
direct deposits       JJ NNS     5.8
online web            JJ NN      1.9
very handy            RB JJ      1.4
virtual monopoly      JJ NN      -2
lesser evil           RBR JJ     -2.3
other problems        JJ NNS     -2.8
low funds             JJ NNS     -6.8
unethical practices   JJ NNS     -8.5
Average                          -1.2

Results of Turney algorithm

  • 410 reviews from Epinions
    • 170 (41%) negative
    • 240 (59%) positive
  • Majority class baseline: 59%
  • Turney algorithm: 74%

  • Phrases rather than words
  • Learns domain-specific information

Using WordNet to learn polarity

S.M. Kim and E. Hovy. 2004. Determining the sentiment of opinions. COLING 2004
M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of KDD, 2004

  • WordNet: online thesaurus (covered in a later lecture).
  • Create positive (“good”) and negative seed-words (“terrible”)
  • Find Synonyms and Antonyms
    • Positive Set: Add synonyms of positive words (“well”) and antonyms of negative words
    • Negative Set: Add synonyms of negative words (“awful”) and antonyms of positive words (”evil”)
  • Repeat, following chains of synonyms
  • Filter
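The propagation loop can be sketched without a real WordNet installation; the tiny synonym/antonym graph below is a hand-made stand-in for WordNet lookups.

```python
# Toy sketch of the Kim & Hovy / Hu & Liu seed-expansion loop. A real system
# would query WordNet; this hand-made graph stands in for it.
synonyms = {"good": {"well", "fine"}, "terrible": {"awful", "dreadful"}}
antonyms = {"good": {"bad", "evil"}, "terrible": set()}

def expand(seeds, iterations=2):
    pos, neg = set(seeds["pos"]), set(seeds["neg"])
    for _ in range(iterations):
        pos |= {s for w in pos for s in synonyms.get(w, ())}  # synonyms of pos
        pos |= {a for w in neg for a in antonyms.get(w, ())}  # antonyms of neg
        neg |= {s for w in neg for s in synonyms.get(w, ())}  # synonyms of neg
        neg |= {a for w in pos for a in antonyms.get(w, ())}  # antonyms of pos
    return pos, neg

pos, neg = expand({"pos": {"good"}, "neg": {"terrible"}})
print(sorted(pos), sorted(neg))
```

A final filtering pass (the last bullet above) would then prune words that landed in both sets or have weak support.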

Summary on Learning Lexicons

  • Advantages:
    • Can be domain-specific
    • Can be more robust (more words)
  • Intuition
    • Start with a seed set of words (‘good’, ‘poor’)
    • Find other words that have similar polarity:
      • Using “and” and “but”
      • Using words that occur nearby in the same document
      • Using WordNet synonyms and antonyms

Supervised methods

Document sentiment classification

  • Classify a whole opinion document (e.g., a review) based on the overall sentiment of the opinion holder (Pang et al 2002; Turney 2002)
    • Classes: Positive, negative (possibly neutral)
  • An example review:
    • “I bought an iPhone a few days ago. It is such a nice phone, although a little large. The touch screen is cool. The voice quality is great too. I simply love it!”
    • Classification: positive or negative?
  • It is basically a text classification problem

Sentence sentiment analysis

  • Classify the sentiment expressed in a sentence
    • Classes: positive, negative, neutral
    • Neutral means no sentiment expressed
      • “I believe he went home yesterday.”
      • “I bought an iPhone yesterday.”
  • But bear in mind
    • Explicit opinion: “I like this car.”
    • Fact-implied opinion: “I bought this car yesterday and it broke today.”
    • Mixed opinion: “Apple is doing well in this poor economy”

Features for supervised learning

  • The problem has been studied by numerous researchers.

  • Key: feature engineering. A large set of features has been tried by researchers. E.g.,
    • Terms frequency and different IR weighting schemes
    • Part of speech (POS) tags
    • Opinion words and phrases
    • Negations
    • Syntactic dependency

Approaches

  • Machine learning
    • Naïve Bayes (assumes conditionally independent features)

    • Maximum Entropy classifier (does not assume feature independence)

    • SVM

    • Markov Blanket Classifier
      • Accounts for conditional feature dependencies
      • Allowed reduction of discriminating features from thousands of words to about 20 (movie review domain)

Sentiment Classification in Movie Reviews

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79—86.
Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL, 271-278


Baseline Algorithm (adapted from Pang and Lee)

  • Tokenization

  • Feature Extraction

  • Classification using different classifiers
    • Naïve Bayes
    • MaxEnt
    • SVM

Sentiment Tokenization Issues

Extracting Features for Sentiment Classification

  • How to handle negation

    • I didn’t like this movie
      vs
    • I really like this movie
  • Which words to use?

    • Only adjectives

    • All words

      • Using all words turns out to work better, at least on this data

Negation

Das, Sanjiv and Mike Chen. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79—86.



Add NOT_ to every word between negation and following punctuation:
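A minimal sketch of this trick over a token list (the negation and punctuation patterns are a simplified subset of what Das & Chen / Pang et al. used):

```python
import re

# Sketch of the negation trick: prefix NOT_ to every token between a
# negation word and the next punctuation mark.
NEGATION = re.compile(r"^(?:not|no|never|cannot)$|n't$", re.IGNORECASE)
PUNCT = re.compile(r"^[.,;:!?]$")

def mark_negation(tokens):
    out, negating = [], False
    for tok in tokens:
        if PUNCT.match(tok):
            negating = False          # punctuation ends the negation scope
            out.append(tok)
        elif negating:
            out.append("NOT_" + tok)  # inside a negation scope
        else:
            out.append(tok)
            if NEGATION.search(tok):
                negating = True       # scope starts after the negation word
    return out

print(mark_negation(["I", "didn't", "like", "this", "movie", ",", "but", "I"]))
```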

Reminder: Naïve Bayes

\[C_{NB} = \underset{c_j \in C}{\operatorname{argmax}}P(c_j) \prod_{i \in positions}{P(w_i|c_j)} \]

\[\hat{P}(w|c) = \frac{count(w,c) + 1}{count(c) + |V|}\]
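The two formulas above fit in a few lines of code. A minimal multinomial Naïve Bayes with add-1 smoothing, trained on two toy reviews:

```python
import math
from collections import Counter

# Minimal Naïve Bayes matching the formulas above; training data is made up.
train = [("pos", "great phone love it".split()),
         ("neg", "terrible battery hate it".split())]

prior = Counter(c for c, _ in train)
counts = {c: Counter() for c in prior}
for c, words in train:
    counts[c].update(words)
vocab = {w for _, ws in train for w in ws}

def log_p(words, c):
    # log P(c) + sum of log P(w|c) with add-1 smoothing
    total = sum(counts[c].values())
    lp = math.log(prior[c] / sum(prior.values()))
    for w in words:
        lp += math.log((counts[c][w] + 1) / (total + len(vocab)))
    return lp

def classify(words):
    return max(prior, key=lambda c: log_p(words, c))

print(classify("love this great phone".split()))
```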

Cross-Validation

  • Break up data into 10 folds

    • (Equal positive and negative inside each fold?)
  • For each fold

    • Choose the fold as a temporary test set

    • Train on 9 folds, compute performance on the test fold

  • Report average performance of the 10 runs
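The procedure above can be sketched generically; `evaluate` is a hypothetical callback that trains on one split and scores the held-out fold (a stratified split, keeping class balance inside each fold, would replace the round-robin split in practice):

```python
# Sketch of k-fold cross-validation: split the data into k folds, hold out
# each fold once, train on the rest, and average the k scores.
def cross_validate(data, evaluate, k=10):
    folds = [data[i::k] for i in range(k)]  # simple round-robin split
    scores = []
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        scores.append(evaluate(train, test))
    return sum(scores) / k
```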

Supervised Sentiment Analysis

  • Negation is important

  • Using all words (in Naïve Bayes) works well for some tasks

  • Finding subsets of words may help in other tasks
    • Hand-built polarity lexicons
    • Use seeds and semi-supervised learning to induce lexicons

Other challenges in SA

Explicit and implicit aspects

(Hu and Liu, 2004)

  • Explicit aspects: Aspects explicitly mentioned as nouns or noun phrases in a sentence

    • “The picture quality of this phone is great.”
  • Implicit aspects: Aspects not explicitly mentioned in a sentence but are implied
    • “This car is so expensive.”
    • “This phone will not easily fit in a pocket.”
    • “Included 16MB is stingy.”
  • Some work has been done (Su et al. 2009; Hai et al 2011)

Explicit Opinions

Bagheri et al. 2013

Some interesting sentences

  • “Trying out Chrome because Firefox keeps crashing.”

    • Firefox - negative; no opinion about Chrome.

    • We need to segment the sentence into clauses to decide that “crashing” only applies to Firefox(?).

  • But how about these?
    • “I changed to Audi because BMW is so expensive.”

    • “I did not buy a BMW because of the high price.”

    • “I am so happy that my iPhone is nothing like my old ugly Droid.”

Some interesting sentences (contd)

  • These two sentences are from paint reviews.

    • “For paintX, one coat can cover the wood color.”

    • “For paintY, we need three coats to cover the wood color.”

    • We know that paintX is good and paintY is not, but how can a system figure this out?

  • “My goal is to get a tv with good picture quality”

  • “The top of the picture was brighter than the bottom.”

  • “When I first got the airbed a couple of weeks ago it was wonderful as all new things are, however as the weeks progressed I liked it less and less.”

Some interesting sentences (contd)

  • Conditional sentences are hard to deal with (Narayanan et al. 2009)

    • “If I can find a good camera, I will buy it.”

    • But conditional sentences can have opinions

      • “If you are looking for a good phone, buy Nokia.”
  • Questions are also hard to handle

    • “Are there any great perks for employees?”

    • “Any idea how to fix this lousy Sony camera?”

Some interesting sentences (contd)

  • Sarcastic sentences

    • “What a great car, it stopped working on the second day.”
  • Sarcastic sentences are common in political blogs, comments and discussions.

    • They make political opinions difficult to handle

Multiclass and Multilabel Classification

Multi-class classification

  • Sentiment: Positive, Negative, Neutral

  • Emotion: angry, sad, joyful, fearful, ashamed, proud, elated

  • Disease: Healthy, Cold, Flu

  • Weather: Sunny, Cloudy, Rain, Snow

One-vs-all (one-vs-rest)

One-vs-all

  • Some classification algorithms naturally permit more than two classes and/or labels, while others are inherently binary; the binary ones can, however, be turned into multiclass classifiers by a variety of strategies.

  • A common strategy is one-vs-all, which involves training a single classifier per class, with the samples of that class as positive samples and all other samples as negatives.

One-vs-all

  • Train a logistic regression classifier \(h_\theta^{(i)}(x)\) for each class \(i\) to predict the probability that \(y=i\)

  • Given a new input \(x\), pick the class \(i\) that maximizes

\[\max_i{h_\theta^{(i)}(x)}\]
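The decision rule can be sketched with toy per-class scorers; the keyword-count scorers below are hypothetical stand-ins for trained logistic regression classifiers \(h_\theta^{(i)}(x)\).

```python
# One-vs-all sketch: one binary scorer per class; prediction picks the class
# whose scorer is most confident. Keyword sets are made-up stand-ins for
# trained per-class logistic regression models.
keywords = {
    "pos": {"great", "love", "nice"},
    "neg": {"terrible", "hate", "awful"},
    "neu": {"bought", "yesterday"},
}

def score(cls, tokens):
    # stand-in for h_theta^(cls)(x): fraction of tokens in the class keywords
    return sum(t in keywords[cls] for t in tokens) / len(tokens)

def predict(tokens):
    return max(keywords, key=lambda c: score(c, tokens))

print(predict("I love this great phone".split()))
```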

Generative Approach



Ex: Naïve Bayes

Estimate \(P(Y)\) and \(P(X|Y)\)




Prediction

\[\hat{y} = \underset{y}{\operatorname{argmax}}P(Y = y)P(X = x|Y = y)\]

Discriminative Approach



Ex: Logistic regression

Estimate \(P(Y|X)\) directly
(Or a discriminant function: e.g., SVM)



Prediction

\[\hat{y} = \underset{y}{\operatorname{argmax}}P(Y = y|X = x)\]

Classification

  • Multiclass classification is the task of classifying instances into one of three or more classes. Classifying instances into one of two classes is called binary classification. Multiclass classification should not be confused with multi-label classification, where multiple labels are to be predicted for each instance.

Multi-label classification

  • In multiclass classification, one-vs-all requires the base classifiers to produce real-valued scores rather than just class labels; the final label is the one corresponding to the class with the highest score.

  • In multilabel classification, the same strategy instead predicts every label whose respective classifier gives a positive result for the sample.
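The contrast between the two decision rules, over the same per-class scores (the scores and class names below are hypothetical):

```python
# Multiclass vs. multilabel over the same per-class scores: multiclass takes
# the argmax; multilabel keeps every class whose score clears a threshold.
scores = {"politics": 0.8, "sports": 0.1, "economy": 0.6}

multiclass_label = max(scores, key=scores.get)
multilabel_labels = [c for c, s in scores.items() if s >= 0.5]

print(multiclass_label, multilabel_labels)
```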

Summary

Summary: what did we learn?

  • Sentiment Analysis

  • Multiclass and Multi-label classification

Practical 4